Camp Springs
Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
Xu, Huilin, Liu, Zhuoyang, Luomei, Yixiang, Xu, Feng
Aerial Vision-and-Language Navigation (VLN) aims to enable unmanned aerial vehicles (UAVs) to interpret natural language instructions and navigate complex urban environments using onboard visual observation. This task holds promise for real-world applications such as low-altitude inspection, search-and-rescue, and autonomous aerial delivery. Existing methods often rely on panoramic images, depth inputs, or odometry to support spatial reasoning and action planning. These requirements increase system cost and integration complexity, thus hindering practical deployment for lightweight UAVs. We present a unified aerial VLN framework that operates solely on egocentric monocular RGB observations and natural language instructions. The model formulates navigation as a next-token prediction problem, jointly optimizing spatial perception, trajectory reasoning, and action prediction through prompt-guided multi-task learning. Moreover, we propose a keyframe selection strategy to reduce visual redundancy by retaining semantically informative frames, along with an action merging and label reweighting mechanism that mitigates long-tailed supervision imbalance and facilitates stable multi-task co-training. Extensive experiments on the Aerial VLN benchmark validate the effectiveness of our method. Under the challenging monocular RGB-only setting, our model achieves strong results across both seen and unseen environments. It significantly outperforms existing RGB-only baselines and narrows the performance gap with state-of-the-art panoramic RGB-D counterparts. Comprehensive ablation studies further demonstrate the contribution of our task design and architectural choices.
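The abstract does not specify how the label reweighting is computed. As a minimal sketch of one common way to counter long-tailed supervision, the example below assigns each action class a weight inversely proportional to its frequency. The function name `inverse_frequency_weights`, the action names, and the label counts are hypothetical and are not taken from the paper.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    Weights are normalized so that the weights summed over all samples
    equal the number of samples (the mean per-sample weight is 1).
    This is an assumed scheme, not the paper's stated mechanism.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}


# Hypothetical long-tailed action labels: "forward" dominates.
actions = ["forward"] * 8 + ["turn_left"] * 1 + ["ascend"] * 1
weights = inverse_frequency_weights(actions)
```

Under this scheme the rare actions receive larger weights than the dominant one, so each class contributes equally to a weighted loss in aggregate.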
- Asia > China > Shanghai > Shanghai (0.05)
- Asia > China > Hubei Province > Wuhan (0.04)
- North America > United States > Maryland > Prince George's County > Greenbelt (0.04)
- Transportation (0.46)
- Government > Regional Government (0.46)
- Information Technology > Robotics & Automation (0.34)
Global Precipitation Nowcasting of Integrated Multi-satellitE Retrievals for GPM: A U-Net Convolutional LSTM Architecture
Rahimi, Reyhaneh, Ebtehaj, Ardeshir, Behrangi, Ali, Tan, Jackson
This paper presents a deep learning architecture for near-global precipitation nowcasting every 30 min with a 4-hour lead time. The architecture fuses a U-Net and a convolutional long short-term memory (LSTM) neural network and is trained using data from the Integrated Multi-satellitE Retrievals for GPM (IMERG) and a few key precipitation drivers from the Global Forecast System (GFS). The impacts of different training loss functions, including the mean-squared error (regression) and the focal loss (classification), on the quality of precipitation nowcasts are studied. The results indicate that the regression network performs well in capturing light precipitation (below 1.6 mm/hr), but the classification network can outperform the regression network for nowcasting of precipitation extremes (>8 mm/hr), in terms of the critical success index (CSI). Using the Wasserstein distance, it is shown that the class probability distribution of the precipitation predicted by the classification network is closer to that of IMERG than the distribution predicted by the regression network. The results also show that including the physical variables can improve precipitation nowcasting in both networks, especially at longer lead times. Taking IMERG as a relative reference, a multi-scale analysis in terms of the fractions skill score (FSS) shows that the nowcasting machine remains skillful (FSS > 0.5) at a resolution of 10 km, compared to 50 km for GFS. For precipitation rates greater than 4 mm/hr, only the classification network remains FSS-skillful on scales greater than 50 km within a 2-hour lead time.
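The two verification metrics named in the abstract can be sketched as follows, using their standard definitions: CSI = hits / (hits + misses + false alarms) for a given rain-rate threshold, and FSS = 1 - MSE(P_f, P_o) / (mean(P_f^2) + mean(P_o^2)), where P_f and P_o are the fractions of cells exceeding the threshold within each n x n neighborhood. The function names, the toy 3 x 3 grids, and the choice to use only fully contained windows are assumptions for illustration and do not come from the paper.

```python
def csi(forecast, observed, threshold):
    """Critical success index for events defined as value > threshold."""
    hits = misses = false_alarms = 0
    for f_row, o_row in zip(forecast, observed):
        for f, o in zip(f_row, o_row):
            f_evt, o_evt = f > threshold, o > threshold
            if f_evt and o_evt:
                hits += 1
            elif o_evt:
                misses += 1
            elif f_evt:
                false_alarms += 1
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


def fractions(field, threshold, n):
    """Fraction of cells above threshold in every fully contained n x n window."""
    rows, cols = len(field), len(field[0])
    out = []
    for i in range(rows - n + 1):
        for j in range(cols - n + 1):
            count = sum(
                field[i + di][j + dj] > threshold
                for di in range(n)
                for dj in range(n)
            )
            out.append(count / (n * n))
    return out


def fss(forecast, observed, threshold, n):
    """Fractions skill score at neighborhood size n (in grid cells)."""
    pf = fractions(forecast, threshold, n)
    po = fractions(observed, threshold, n)
    mse = sum((a - b) ** 2 for a, b in zip(pf, po)) / len(pf)
    ref = (sum(a * a for a in pf) + sum(b * b for b in po)) / len(pf)
    return 1 - mse / ref if ref else float("nan")


# Toy fields (mm/hr): the forecast places the heavy cell one column too far east.
observed = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
forecast = [[0, 0, 0], [0, 0, 9], [0, 0, 0]]

csi_score = csi(forecast, observed, threshold=8)          # 0.0: no overlapping cell
fss_grid = fss(forecast, observed, threshold=8, n=1)      # 0.0 at grid scale
fss_coarse = fss(forecast, observed, threshold=8, n=3)    # 1.0 over the 3 x 3 window
```

The displaced cell scores zero at grid scale but is fully credited once the neighborhood covers both locations, which is why a multi-scale FSS analysis can report a smallest scale at which a forecast remains skillful.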
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > United States > Minnesota (0.04)
- Southern Ocean (0.04)
Operator Guidance Informed by AI-Augmented Simulations
Edwards, Samuel J., Levine, Michael
Operational guidance is provided in the form of a selection of speeds and headings, and is generally based on accessing ship motion response predictions from a pre-computed database or lookup table for a given condition. Operational guidance is an important consideration in the survival of a ship and has been the focus of many International Maritime Organization (IMO) publications: IMO (1995), IMO (2007), and IMO (2020). Recommendations for ship-specific operational guidance have been developed and discussed in the interim guidelines of the Second Generation Intact Stability by IMO, IMO (2020). While these guidelines are certainly useful in design and at sea, they are not comprehensive. The ocean environment is random and complex.
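The lookup-table approach described above can be illustrated with a minimal sketch. The table layout, the roll-angle response, the limit, and every numeric value below are made-up placeholders for illustration only; they are not taken from the paper or from any IMO guideline.

```python
# Hypothetical pre-computed table for one sea condition:
# (speed in knots, heading in degrees relative to the waves) -> predicted RMS roll in degrees.
ROLL_TABLE = {
    (10, 0): 2.0,
    (10, 90): 9.5,
    (10, 180): 3.0,
    (20, 0): 2.5,
    (20, 90): 7.0,
    (20, 180): 4.5,
}


def acceptable_conditions(table, roll_limit_deg):
    """Return the (speed, heading) pairs whose predicted roll is within the limit."""
    return sorted(key for key, roll in table.items() if roll <= roll_limit_deg)


# Speeds and headings to recommend under an assumed 5-degree roll limit.
guidance = acceptable_conditions(ROLL_TABLE, roll_limit_deg=5.0)
```

A table like this is only as complete as the conditions that were pre-computed, which is the limitation the abstract raises when it notes that the ocean environment is random and complex.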
- Europe > Norway > Western Norway > Vestland > Bergen (0.05)
- North America > United States > Maryland > Montgomery County > Bethesda (0.05)
- North America > United States > District of Columbia > Washington (0.05)